


WO 91/17243 PCT/DK91/00123 






SEQUENCE LISTING 






(1) GENERAL INFORMATION:

(i) APPLICANT: NOVO NORDISK A/S
(ii) TITLE OF INVENTION: A Cellulase Preparation
(iii) NUMBER OF SEQUENCES: 4

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: NOVO NORDISK A/S, Patent Department
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: DENMARK
(F) ZIP: DK-2880

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Thalsoe-Mai

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: +45 4444 8888
(B) TELEFAX: +45 4449 3256
(C) TELEX: 37304







(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Humicola insolens
(B) STRAIN: DSM 1800

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 73..927

(ix) FEATURE:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 10..72

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..927



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC    48
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC    96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG    144
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
10 15 20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC    192
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC    240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
45 50 55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT    288
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC    336
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG    384
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
90 95 100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC    432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT    480
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC    528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC    576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
155 160 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC    624
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
170 175 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC    672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC    720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC    768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC    816
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC    864
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC    912
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA    964
His Gln Cys Leu

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG    1024

GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC    1060
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1473 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Fusarium oxysporum
(B) STRAIN: DSM 2672

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA    60

AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT    114
Met Arg Ser Tyr Thr Leu
1 5

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT    162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
10 15 20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC    210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
25 30 35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC    258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
40 45 50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT    306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG    354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
75 80 85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC    402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC    450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
105 110 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC    498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC    546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG    594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT    642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC    690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
185 190 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC    738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC    786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG    834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT    882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC    930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC    978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG    1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC    1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG    1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT    1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
345 350 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC    1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA    1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC    1334
TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG    1394
CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA    1454
AAAAAAAAAA AAAAAAAAA    1473
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375




SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: Rasmussen, Grethe
Mikkelsen, Jan Moller
Schulein, Martin
Patkar, Shankant A.
Hagen, Fred

(ii) TITLE OF INVENTION: A Cellulase Preparation Comprising Endoglucanase Enzyme

(iii) NUMBER OF SEQUENCES: 33

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Novo Nordisk of North America, Inc.
(B) STREET: 405 Lexington Avenue, 64th Floor
(C) CITY: New York
(D) STATE: New York
(E) COUNTRY: United States of America
(F) ZIP: 10174-6401

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/389,423
(B) FILING DATE: 14-FEB-1995
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Lambiris, Elias J.
(B) REGISTRATION NUMBER: 33,728
(C) REFERENCE/DOCKET NUMBER: 3469.214-US

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 212-867-0123
(B) TELEFAX: 212-878-9655

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Humicola insolens
(B) STRAIN: DSM 1800

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 73..924

(ix) FEATURE:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 10..72

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..924

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
10 15 20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
45 50 55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
90 95 100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
155 160 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
170 175 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA
His Gln Cys Leu

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG

GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC
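
The CDS coordinates given for SEQ ID NO:1 (10..924) and the translation printed beneath the nucleotide sequence can be cross-checked mechanically. The following is a minimal sketch and not part of the original listing: the names `translate` and `feature_length` are illustrative assumptions, the standard genetic code is assumed, and only the first 33 nucleotides of the sequence are included.

```python
# Standard genetic code, laid out in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive LOCATION coordinates."""
    return end - start + 1


# First 33 nucleotides of SEQ ID NO:1 as printed in the listing.
seq1_start = "GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG".replace(" ", "")
cds_start = 10  # 1-based start of the CDS feature

# Convert the 1-based start to a 0-based slice index before translating.
peptide = translate(seq1_start[cds_start - 1:])
print(peptide)
print(feature_length(10, 924) // 3)  # codons in the CDS
```

Under these assumptions the printed peptide corresponds to Met Arg Ser Ser Pro Leu Leu Pro, and the codon count of the CDS equals the 305 amino acids stated for SEQ ID NO:2 (21 for the signal peptide at 10..72 plus 284 for the mature peptide at 73..924).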



(2) INFORMATION FOR SEQ ID NO:2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 305 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1473 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Fusarium oxysporum

(B) STRAIN: DSM 2672

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA    60

AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT    114
Met Arg Ser Tyr Thr Leu
1 5

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT    162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
10 15 20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC    210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
25 30 35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC    258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
40 45 50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT    306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG    354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
75 80 85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC    402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC    450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
105 110 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC    498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC    546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG    594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT    642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC    690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
185 190 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC    738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC    786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG    834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT    882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC    930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC    978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG    1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC    1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG    1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT    1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
345 350 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC    1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA    1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC    1334
TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG    1394
CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA    1454
AAAAAAAAAA AAAAAAAAA    1473



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375






(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 
AGCTGCGGCC GCAGGCCGCG GAGGCCA 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
AGCTTGGCCT CCGCGGCCTG CGGCCGC 
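
SEQ ID NO:5 and SEQ ID NO:6 appear to be complementary over 23 of their 27 bases, with a 4-base AGCT left unpaired at the 5' end of each strand. The listing does not state how the two oligonucleotides are used, so the following is only a sketch that checks this observation about the sequences; the helper name `reverse_complement` and the variable names are illustrative assumptions.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (spaces ignored)."""
    return dna.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


# Sequences as printed in the listing (grouping spaces removed).
seq5 = "AGCTGCGGCC GCAGGCCGCG GAGGCCA".replace(" ", "")
seq6 = "AGCTTGGCCT CCGCGGCCTG CGGCCGC".replace(" ", "")

overhang = 4  # unpaired bases at the 5' end of each strand

# The two strands pair if SEQ ID NO:6, minus its 5' overhang, equals the
# reverse complement of SEQ ID NO:5 minus the part facing that overhang.
duplex_ok = reverse_complement(seq5)[: len(seq5) - overhang] == seq6[overhang:]
print(len(seq5), len(seq6), duplex_ok)
```

Both lengths agree with the 27 base pairs stated in the SEQUENCE CHARACTERISTICS of each entry.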



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AATTCGCGGC CGCGGCCATG GAGGCC 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



AATTGGCCTC CATGGCCGCG GCCGCG 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AAYGCYGACA AAYCC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
AACGAYGAYG GNAAYTTCCC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 
AAYGAYTGGT ACCAYCARTG 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 
GCGCCAGTAG CAGCCGGGCT TGAGGG 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 
ACGTCTCAAC TCGGATCCAA GATGCGTT 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 
CTCAACTCTG ATCAAGATGC GTTCC 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 
TGTCGACCAG TAAGGCCCTC AAGCTG 



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16 
GACAGAGCAC AGAATTCACT AGTGAGCTCT 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
TGGGAYTGYT GYAARCC 
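
SEQ ID NO:17 uses IUPAC ambiguity codes (Y = C or T, R = A or G). As a sketch of how such a degenerate sequence can be compared with a concrete one, the code below counts the distinct sequences it represents and tests it against the 17 nucleotides covering Trp Asp Cys Cys Lys Pro in SEQ ID NO:1 and SEQ ID NO:3. The function names and the two target fragments chosen for comparison are assumptions made for illustration, not something the listing specifies.

```python
from math import prod

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct unambiguous sequences a degenerate primer covers."""
    return prod(len(IUPAC[base]) for base in primer)


def matches(primer: str, target: str) -> bool:
    """True if every target base is allowed by the primer at that position."""
    return len(primer) == len(target) and all(
        t in IUPAC[p] for p, t in zip(primer, target)
    )


seq17 = "TGGGAYTGYTGYAARCC"

# First 17 nucleotides of the codons for Trp Asp Cys Cys Lys Pro.
humicola = "TGGGACTGCTGCAAGCC"  # from SEQ ID NO:1 (TGG GAC TGC TGC AAG CCT)
fusarium = "TGGGATTGCTGCAAGCC"  # from SEQ ID NO:3 (TGG GAT TGC TGC AAG CCT)

print(degeneracy(seq17))
print(matches(seq17, humicola), matches(seq17, fusarium))
```

With four two-fold ambiguous positions the sequence stands for 16 variants, and under these assumptions both fragments fall within that set.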



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 
AGGGAGACCG GAATTCTGGG AYTGYTGYAA RCC 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 



CCNGGNGGNG GNGTNGG 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
AGGGAGACCG GAATTCCCNG GNGGNGGNGT NGG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21 
ACNAYCATNK TYTTNCC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22 
GACAGAGCAC AGAATTCACN AYCATNKTYT TNCC 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 




NGGRTTRTCN GCNKYYTYRA ACCA 



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GACAGAGCAC AGAATTCNGG RTTRTCNGCN KYYTYRAACC A 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG TA

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC TG 






(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
TGTACGCATG TAACATTA 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTGCACAATA TTTCAAGC 

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG TA

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC TG 



(2) INFORMATION FOR SEQ ID NO: 31: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31 
AGCTTCTCAA GGACGGTT 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 
AACAAGGGTC GAACACTT 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 
CCAGAAGACC AAGGATT 



